
Data preparation

R Code "Data_Process_Combine_Data.R" merges data from multiple sources:
1. Satellite-measured AOD data ("processed data/AOD_all.csv"), downloaded from NASA (https://search.earthdata.nasa.gov/; we use MOD04_3K from Terra).
2. Weather data (precipitation, temperature, and dew point; "processed data/all_weather.csv"), downloaded from PRISM (https://prism.oregonstate.edu/).
3. EPA inspection data ("ECHO_PP_RID.csv"), obtained from the ECHO database (https://echo.epa.gov/tools/data-downloads#exporter).
4. Power plant information ("raw data/PowerPlants_US_EIA_April2019/PowerPlants_US_201904.dbf", combined in the EPA inspection data), downloaded from EIA (https://www.eia.gov/maps/layer_info-m.php).
5. Power plant registry IDs (crosswalk from power plant ID to EPA ECHO ID; "raw data/powerplant_with_registryID.csv"), obtained from EPA FRS (https://www.epa.gov/frs).
6. Power plant production and emissions data ("raw data/epaampd_q1q4.csv"), downloaded from https://ampd.epa.gov/ampd/.
This code produces "Data_Combined.Rdata". Processing time: ~ 5 minutes

Note: EPA inspection data (inspection number, penalty, and source) is not used in the analysis because it does not vary at the daily level. However, we keep it in the dataset in case users want to use it in further studies.

*******************************************************************************************

We merge AOD, weather, and wind data with power plants based on geographical locations.

Satellite-measured AOD data is processed by:
R Code "Data_Process_AOD_Step_1.R"
Total processing time: ~ 14,400 minutes
R Code "Data_Process_AOD_Step_2.R"
Total processing time: ~ 3,600 minutes
These two codes produce "processed data/AOD_all.csv".


Weather data (precipitation, temperature, and dew point) is processed by:
R Code "Data_Process_Weather.R"
This code produces "processed data/all_weather.csv". Total processing time: ~ 14,200 minutes

Wind data is processed by:
R Code "Data_Process_Wind.R"
These two codes produce "processed data/all_weather.csv". Processing time: ~ 25 minutes

*******************************************************************************************

R Code "Prepare Final Data.R" is the code for final data cleaning on "Data_Combined.Rdata". It also merges violation penalty source data ("Inspection_Penalty_Source.csv"). 
This code produces "data_final.Rdata" that is ready for analysis. Processing time: ~ 4 minutes

*******************************************************************************************

Total processing time for all data preparation codes: ~ 32,300 minutes
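
The run order implied by the inputs and outputs described above can be sketched as follows. This is only an illustration, not a driver script shipped with the package: the helper name run_in_order and its dry_run argument are hypothetical, the relative order of the weather and wind codes is an assumption, and the codes may expect a particular working directory.

```r
# Run order implied by the descriptions above (a sketch, not the authors' driver script).
# Assumption: the weather code runs before the wind code.
prep_scripts <- c(
  "Data_Process_AOD_Step_1.R",
  "Data_Process_AOD_Step_2.R",
  "Data_Process_Weather.R",
  "Data_Process_Wind.R",
  "Data_Process_Combine_Data.R",
  "Prepare Final Data.R"
)

# Hypothetical helper: source each script in order, stopping if a file is missing.
run_in_order <- function(scripts, dir = ".", dry_run = FALSE) {
  for (s in scripts) {
    path <- file.path(dir, s)
    if (dry_run) {
      message("Would run: ", path)
      next
    }
    if (!file.exists(path)) stop("Script not found: ", path)
    message("Running: ", path)
    # Each script is evaluated in its own environment to avoid name clashes.
    source(path, local = new.env())
  }
  invisible(scripts)
}

# run_in_order(prep_scripts)                 # full run; see the total time above
run_in_order(prep_scripts, dry_run = TRUE)   # only prints the planned order
```

A dry run is a cheap way to confirm the order before committing to the full processing time.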

*******************************************************************************************

Analysis:

"data_final.Rdata" is the data set used in all analysis codes listed below.

*******************************************************************************************

BASELINE ANALYSIS CODES:

R Code "Summary Statistics.R" computes the summary statistics and produces Table 1.
Processing time: < 1 minute

R Code "Raw Trend" plots Figures 2 and 3.
Processing time: ~ 1 minute

R Code "Diff_in_Diff.R" is the baseline analysis and produces Table 2 and Appendix Table A1. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 1 minute

R Code "Attainment Status Analysis.R" is the analysis by county attainment status. This code uses additional data, "nonattain.xls" (county nonattainment status, downloaded from the EPA Green Book), and produces Table 3 and Appendix Table A6. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 3 minutes

R Code "Event Study.R" is the event study and produces Figure 4 and Appendix Figure A2. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 3 minutes

*******************************************************************************************

ROBUSTNESS CHECKS:

R Code "Robustness Tests.R" is the robustness check analysis that produces Tables 4-6 and Appendix Tables A10-A12. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 5 minutes

R Code "Honest_DID.R" analyzes potential violations of the parallel trends assumption and produces Figure 5. The results are estimated by "lfe" package version 3.0-0 (this analysis uses a different package version because it was conducted at the reviewer's suggestion during the third round of revision, 3 years after the initial submission).
Processing time: ~ 7 minutes

R Code "Alternative Control.R" tests the sensitivity of the results to an alternative definition of the control group and produces Table 7 and Appendix Table A15. The results are estimated by "lfe" package version 3.0-0 (this analysis uses a different package version because it was conducted at the reviewer's suggestion during the third round of revision, 3 years after the initial submission).
Processing time: ~ 4 minutes
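
Because the codes rely on two different "lfe" versions (2.8-7 and 3.0-0), it may help to check the installed version before running a code. The sketch below is only an illustration: the helper has_version is hypothetical, and the suggested install command assumes the "remotes" package is installed and that the archived version is still available on CRAN.

```r
# Hypothetical helper: TRUE if the installed version of `pkg` equals `expected`.
has_version <- function(pkg, expected) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(FALSE)
  packageVersion(pkg) == package_version(expected)
}

if (!has_version("lfe", "2.8-7")) {
  message("lfe 2.8-7 is not the installed version. One possible way to install it:")
  # Assumption: 'remotes' is installed and the CRAN archive is reachable.
  message('  remotes::install_version("lfe", version = "2.8-7")')
}
```

The same check with "3.0-0" applies to "Honest_DID.R" and "Alternative Control.R".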

R Code "Heat Rate Trend.R" plots Figure 7. Heat rate is calculated from CEMS heat input and gross load data downloaded from https://campd.epa.gov/data/custom-data-download.
Processing time: ~ 1 minute
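
The exact formula used in "Heat Rate Trend.R" is not stated here. As a rough sketch, heat rate is conventionally heat input divided by gross load; the example below assumes heat input in mmBtu and gross load in MWh, and the column names and toy values are hypothetical rather than taken from the CAMPD download.

```r
# Hypothetical toy data; column names are illustrative, not those of the CAMPD export.
cems <- data.frame(
  plant_id = c(1, 1, 2),
  heat_input_mmbtu = c(1000, 1100, 2000),
  gross_load_mwh = c(100, 100, 250)
)

# Heat rate in Btu/kWh, using 1 mmBtu/MWh = 1,000 Btu/kWh.
heat_rate_btu_per_kwh <- function(heat_input_mmbtu, gross_load_mwh) {
  # Observations with zero load have no defined heat rate, so they become NA.
  ifelse(gross_load_mwh > 0, 1000 * heat_input_mmbtu / gross_load_mwh, NA_real_)
}

cems$heat_rate <- heat_rate_btu_per_kwh(cems$heat_input_mmbtu, cems$gross_load_mwh)
print(cems)
```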

*******************************************************************************************

Separate Mechanisms:

R Code "Test Separate Mechanism.R" analyzes the potential mechanisms of pollution fluctuation and produces Table 7 and Appendix Tables A13-A14. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 5 minutes

*******************************************************************************************

Appendix Results:

R Code "Raw Trend Appendix.R" plots Appendix Figure A1. 
Processing time: ~ 1 minute

R Code "Fixed Effect Choices.R" tests the sensitivity of the results to different choices of fixed effects and produces Appendix Tables A2-A5. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 10 minutes


R Code "Attainment Status Additional 1" is the robustness check analysis that produces Appendix Table A7. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 5 minutes


R Code "Attainment Status Additional 2" is the robustness check analysis that produces Appendix Tables A8 and A9. The results are estimated by "lfe" package version 2.8-7.
Processing time: ~ 2 minutes


*******************************************************************************************

Total processing time for all analysis codes: ~ 45 minutes

